
Code: https://github.com/GinoAndTonic/ssylvain_public/blob/master/research/haats
Documentation: https://github.com/GinoAndTonic/ssylvain_public/blob/master/research/haats/haats_documentation.pdf
How do we know when to go long/short break-evens (nominal bonds vs. ILBs)?
What is the market-implied inflation rate?
How can we extract the term premium using a theoretical/structural model?
How can we forecast both nominal and ILB yields in a consistent, theoretically robust manner?
How can we extract and forecast the market-implied discount rates which price these securities?

But wait… BE = Expected Inflation + Inflation Risk Premium (IRP), so the break-even rate is not a pure forecast of inflation.
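Mechanically, the break-even rate is just the spread between a nominal yield and a maturity-matched real (ILB) yield. A minimal sketch of the identity (the 3.4%/1.2% figures are made up for illustration, not from the data below):

```python
def breakeven(y_nominal, y_real):
    """Break-even inflation: the spread between a nominal yield and a
    maturity-matched real (ILB) yield. Per the identity above, it bundles
    expected inflation with the inflation risk premium: BE = E[inf] + IRP."""
    return y_nominal - y_real

# Hypothetical 10y yields: nominal 3.4%, TIPS 1.2% -> breakeven near 2.2%
be_10y = breakeven(3.4, 1.2)
```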
Suppose there are several nominal bonds (N) and several inflation-linked ("real") bonds (R).
The no-arbitrage price of a zero-coupon bond with maturity $\tau$ is:
\begin{eqnarray*} P_{t}^{N}\{\tau\}&=&E_{t}\left[\frac{M_{t+\tau}^{N}}{M_{t}^{N}}\times1\right]=\exp\left(-y_{t}^{N}\left\{ \tau\right\} \cdot\tau\right)\\&&\\P_{t}^{R}\{\tau\}&=&E_{t}\left[\frac{M_{t+\tau}^{R}}{M_{t}^{R}}\times1\right]=\exp\left(-y_{t}^{R}\left\{ \tau\right\} \cdot\tau\right) \end{eqnarray*}Here, $y_{t}^{N}$ and $y_{t}^{R}$ are the nominal and real yields, respectively; $M_{t}^{N}$ and $M_{t}^{R}$ are the nominal and real state price densities (SPDs).
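The price–yield mapping in these equations is simply $P=\exp(-y\tau)$ with continuous compounding. A small helper (ours, not from the repo) makes the inversion explicit:

```python
import numpy as np

def zero_coupon_price(y, tau):
    """P_t{tau} = exp(-y_t{tau} * tau), with y continuously compounded."""
    return np.exp(-y * tau)

def zero_coupon_yield(p, tau):
    """Invert the pricing equation: y = -ln(P) / tau."""
    return -np.log(p) / tau
```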
The SPDs follow
\begin{eqnarray*} \frac{dM_{t}^{R}}{M_{t}^{R}}&=&-r_{t}^{R}dt-\Gamma_{t}\cdot dW_{t}^{Q}\\&&\\\frac{dM_{t}^{N}}{M_{t}^{N}}&=&-r_{t}^{N}dt-\Gamma_{t}\cdot dW_{t}^{Q} \end{eqnarray*}For what follows, to simplify the notation, let us suppress the N and R superscripts.
The short rates and risk prices are assumed to be affine in the state variables. This is where we put the rabbit inside the hat...
\begin{eqnarray*} r_{t}&=&\rho_{0}+\rho_{1}\cdot X_{t} \\ \Gamma_{t}&=&\gamma_{0}+\gamma_{1}\cdot X_{t} \end{eqnarray*}Solving uniquely for the bond-pricing loading functions $B\left\{ \tau\right\}$ and $G\left\{ \tau\right\}$ typically requires imposing ad-hoc parameter values and other restrictions with little economic motivation.
Instead, following Christensen, Lopez, and Rudebusch (2010), we assume a dynamic Nelson-Siegel (1987) model and impose level, slope, and curvature restrictions by replacing the Nelson-Siegel (1987) parameters with level, slope, and curvature state variables.
This is sensible since there is ample evidence that level, slope, and curvature (e.g., from PCA) explain the cross-section of government bond yields. Furthermore, the data we will use to fit the model are themselves smoothed yields from Nelson-Siegel-Svensson-type models.
Hence, the second key assumption concerns the state variables:
\begin{eqnarray*} X_{t}&=&\left(\begin{array}{c} L_{t}^{N}\\ S_{t}\\ C_{t}\\ L_{t}^{R} \end{array}\right) \\ dX_{t}&=&K^{P}\left(\theta^{P}-X_{t}\right)dt+\Sigma dW_{t} \end{eqnarray*}This implies
\begin{eqnarray*} y_{t}^{N}\{\tau\} &=& L_{t}^{N}+S_{t}\left(\frac{1-e^{-\lambda\tau}}{\lambda\tau}\right)+C_{t}\left(\frac{1-e^{-\lambda\tau}}{\lambda\tau}-e^{-\lambda\tau}\right)-\frac{G_{t}^{N}\left\{ \tau\right\} }{\tau} \\ y_{t}^{R}\{\tau\} &=& L_{t}^{R}+\alpha^{R}S_{t}\left(\frac{1-e^{-\lambda\tau}}{\lambda\tau}\right)+\alpha^{R}C_{t}\left(\frac{1-e^{-\lambda\tau}}{\lambda\tau}-e^{-\lambda\tau}\right)-\frac{G_{t}^{R}\left\{ \tau\right\} }{\tau} \end{eqnarray*}We observe the Nominal bond and ILB yields but the state variables (X) are hidden. We also need to estimate the model parameters.
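For a given decay parameter $\lambda$, the loadings in these yield equations are simple closed forms. A hypothetical helper (function and argument names are ours, not the repo's):

```python
import numpy as np

def ns_loadings(tau, lam):
    """Dynamic Nelson-Siegel loadings on (level, slope, curvature)
    at maturity tau for decay parameter lam."""
    x = lam * tau
    slope = (1.0 - np.exp(-x)) / x
    return 1.0, slope, slope - np.exp(-x)

def model_yield(level, slope_f, curv_f, tau, lam, G=0.0):
    """Model-implied yield: factor part minus the adjustment term G{tau}/tau."""
    l, s, c = ns_loadings(tau, lam)
    return level * l + slope_f * s + curv_f * c - G / tau
```

At the short end the slope loading approaches 1 and the curvature loading 0; both die out at long maturities, leaving only the level factor.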
First, we can re-write the key equations of the model...
Measurements: $$y_{t}=A_{0}+A_{1}X_{t}+\epsilon_{t} \qquad \text{with }\epsilon_{t}\sim N(0,\Phi)$$
States: $$X_{t}=U_0+U_1 X_{t_-1}+\eta_{t} \qquad \text{with } \eta_{t}\sim N(0,Q)$$
Since the Brownian motion increments are Gaussian, Kalman filtering is an efficient and consistent estimator. It also accommodates asynchronous/missing observations (crucial if we later include observed macro variables as state variables).
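With the measurement and state equations in this linear-Gaussian form, one predict/update step of the filter looks like the following (a generic textbook step, not the repo's implementation):

```python
import numpy as np

def kalman_step(x, P, y, A0, A1, Phi, U0, U1, Q):
    """One predict/update step of the Kalman filter for
    measurement: y_t = A0 + A1 x_t + eps_t,     eps ~ N(0, Phi)
    state:       x_t = U0 + U1 x_{t-1} + eta_t, eta ~ N(0, Q)."""
    # Predict
    x_pred = U0 + U1 @ x
    P_pred = U1 @ P @ U1.T + Q
    # Update
    y_pred = A0 + A1 @ x_pred
    S = A1 @ P_pred @ A1.T + Phi          # innovation covariance
    K = P_pred @ A1.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ (y - y_pred)
    P_new = (np.eye(len(x)) - K @ A1) @ P_pred
    return x_new, P_new
```

With a near-noiseless measurement, a single update pulls the state estimate essentially onto the observation, as expected.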
We use the Kalman filter along with the Expectation-Maximization (EM) algorithm to jointly extract the hidden states and estimate the model parameters.
Note that $y_t$ is both the input and the target.
1) Start with a guess for the set of parameters, $\Omega$
2) Run the Kalman filter
3) Run the Kalman smoother
4) Compute the expected log-likelihood:
$$ \mathcal{\tilde{L}}\left\{ \Omega\right\} = E\left[\ln\left(\left(\prod_{t=1}^{T}p\left(Y_{t}|X_{t},\Omega\right)\right)\left(\prod_{t=1}^{T}p\left(X_{t}|X_{t-1},\Omega\right)\right)\right)\bigg|X^{(T)}\right] $$
5) Solve for $\hat{\Omega}$: $$\hat{\Omega} = \arg\max_{\Omega}\mathcal{\tilde{L}}\left\{ \Omega\right\} $$
6) repeat steps 2-5 until change in $\hat{\Omega}$ is below a pre-specified tolerance level
Alternatively, a Bayesian approach would replace steps 4)-5) with:
4) choosing priors for $\Omega$, and
5) sampling from the posterior distribution of $\Omega$
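The EM steps above can be sketched as a generic loop; `e_step` and `m_step` are hypothetical callables standing in for the model-specific filter/smoother and maximizer:

```python
import numpy as np

def em_estimate(y, omega0, e_step, m_step, tol=1e-6, max_iter=200):
    """Generic EM loop: start from omega0, alternate the E-step (Kalman
    filter + smoother, summarized by e_step) with the M-step (maximize
    the expected log-likelihood, m_step) until the parameter vector
    stops moving by more than tol."""
    omega = np.asarray(omega0, dtype=float)
    for _ in range(max_iter):
        stats = e_step(y, omega)                  # steps 2-4
        omega_new = np.asarray(m_step(y, stats))  # step 5
        if np.max(np.abs(omega_new - omega)) < tol:  # step 6
            return omega_new
        omega = omega_new
    return omega
```

As a toy check, if the "model" is just a Gaussian mean, the loop converges to the sample mean in two iterations.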
Importing (Nelson-Siegel smoothed/fitted) yield data for TIPS and nominal bonds...
Data source:
http://www.federalreserve.gov/econresdata/researchdata/feds200628.xls
https://www.federalreserve.gov/econresdata/researchdata/feds200805.xls
# The data is daily, but let's focus on a sub-sample of years at weekly frequency
tips_data, nominal_data = tips_data.loc['2004-01-01':'2016-05-30'],nominal_data.loc['2004-01-01':'2016-05-30']
tips_data, nominal_data = tips_data.asfreq('W', method='pad'), nominal_data.asfreq('W', method='pad')
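A toy check of what `asfreq('W', method='pad')` does, on synthetic daily data (dates and values made up):

```python
import numpy as np
import pandas as pd

# Ten consecutive daily observations starting on a Thursday (2004-01-01)
idx = pd.date_range('2004-01-01', periods=10, freq='D')
daily = pd.DataFrame({'y': np.arange(10.0)}, index=idx)

# 'W' means week-ending-Sunday stamps; method='pad' carries the last
# available daily value forward onto each weekly stamp
weekly = daily.asfreq('W', method='pad')
```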
fig = plt.figure();ax1= plt.subplot(121)
nominal_data.plot(ax=ax1)
plt.legend(loc='center left',fontsize=9,frameon=0, bbox_to_anchor=(1, 0.5))
ax2=plt.subplot(122,sharey=ax1)
tips_data.plot(ax=ax2)
plt.legend(loc='center left',fontsize=9,frameon=0, bbox_to_anchor=(1, 0.5))
fig.subplots_adjust(wspace=0.5);plt.show()
nominal_data[['Nominals_y01','Nominals_y05','Nominals_y10','Nominals_y30']].dropna(axis=0).describe()
| | Nominals_y01 | Nominals_y05 | Nominals_y10 | Nominals_y30 |
|---|---|---|---|---|
| count | 646.000000 | 646.000000 | 646.000000 | 646.000000 |
| mean | 1.528356 | 2.492177 | 3.443919 | 4.082252 |
| std | 1.739018 | 1.298076 | 1.078136 | 0.782272 |
| min | 0.091700 | 0.617300 | 1.478600 | 2.351800 |
| 25% | 0.216300 | 1.498075 | 2.378675 | 3.320150 |
| 50% | 0.465700 | 2.096400 | 3.637900 | 4.287750 |
| 75% | 2.678700 | 3.659175 | 4.408500 | 4.697850 |
| max | 5.277600 | 5.110100 | 5.256600 | 5.824800 |
tips_data[['TIPS_y02','TIPS_y05','TIPS_y10','TIPS_y20']].dropna(axis=0).describe()
| | TIPS_y02 | TIPS_y05 | TIPS_y10 | TIPS_y20 |
|---|---|---|---|---|
| count | 647.000000 | 647.000000 | 647.000000 | 647.000000 |
| mean | 0.119568 | 0.589371 | 1.194858 | 1.637228 |
| std | 1.460355 | 1.192668 | 0.957266 | 0.727053 |
| min | -2.202000 | -1.712700 | -0.830400 | 0.071500 |
| 25% | -1.049900 | -0.326300 | 0.495650 | 1.000800 |
| 50% | -0.128200 | 0.506500 | 1.358800 | 1.900000 |
| 75% | 0.976600 | 1.440500 | 1.959000 | 2.245800 |
| max | 5.159100 | 3.679600 | 3.577300 | 3.268200 |
There is a fair amount of correlation and autocorrelation.
The model can handle this (it does not blow up), but we may want to be more careful about which bonds we choose to include.
plt.rc('text', usetex=False);fig = plt.figure(figsize=(20,8))
ax1=plt.subplot(121)
sns.heatmap( nominal_data.corr())
ax2=plt.subplot(122);plt.rc('text', usetex=False)
sns.heatmap( tips_data.corr())
fig.subplots_adjust(wspace=0.3);plt.show()
from itertools import cycle
fig = plt.figure(figsize=(16,6))
ax1=plt.subplot(121);lines = ["-","--","-.",":"];linecycler = cycle(lines)
for i in nominal_data.columns:
plt.acorr(nominal_data[[i]].dropna(axis=0).iloc[:,0],maxlags=50, usevlines=False, linestyle=next(linecycler),marker=None,label=i )
plt.xlim(0,50)
plt.title('Nominal Bonds autocorrelations');plt.xlabel('lag')
plt.legend(loc='center left',fontsize=9,frameon=0, bbox_to_anchor=(1, 0.5))
ax2=plt.subplot(122,sharey=ax1);lines = ["-","--","-.",":"];linecycler = cycle(lines)
for i in tips_data.columns:
plt.acorr(tips_data[[i]].dropna(axis=0).iloc[:,0],maxlags=50, usevlines=False, linestyle=next(linecycler),marker=None,label=i )
plt.xlim(0,50)
plt.title('TIPS autocorrelations');plt.xlabel('lag')
plt.legend(loc='center left',fontsize=9,frameon=0, bbox_to_anchor=(1, 0.5))
fig.subplots_adjust(wspace=0.5);plt.show()
# Instantiate estimation object:
estimation1 =Rolling()
# Set up data, etc.:
estimation1.run_setup(data, US_ilbmaturities, US_nominalmaturities, \
estim_freq=estim_freq, num_states=num_states,\
fix_Phi=fix_Phi, setdiag_Kp=setdiag_Kp, initV=initV)
fig = plt.figure(figsize=(14,8));ax1=plt.subplot(221);estimation1.fit_path['objective'].plot(ax=ax1)
plt.title('optimality: negative log-likelihood');plt.xlabel('outer iterations')
ax2 = plt.subplot(222);estimation1.fit_path_inner['sub_objective'].plot(ax=ax2)
plt.title('optimality: negative log-likelihood');plt.xlabel('inner iterations')
plt.ylim(ymax=estimation1.fit_path_inner['sub_objective'].iloc[1],ymin=estimation1.fit_path_inner['sub_objective'].min())
ax3 = plt.subplot(212);estimation1.fit_path['criteria'].plot()
plt.title('tolerance: maximum absolute change in parameters');fig.subplots_adjust(hspace=0.3);plt.show();
We do a decent job of fitting some bond yields but a poorer job on others.
This could be improved with a more flexible model (more parameters) and a more robust optimization method.
linestyles = ['-', '--', '-.', ':','-', '--', '-.', ':','-', '--', '-.', ':','-', '--', '-.', ':']
plt.rc('text', usetex=True);fig = plt.figure(figsize=(16,8));ax1 = plt.subplot(121)
pd.concat([estimation1.Y.iloc[3:,0:4].rename(columns={i:str.replace(i,'_','\_') for i in estimation1.Y.columns.values},inplace=False),
estimation1.Ytt_new.iloc[3:,0:4].rename(columns={i:str.replace(i,'_','\_') +'$_{k|k}$' for i in estimation1.Ytt_new.columns.values},inplace=False),
estimation1.Yttl_new.iloc[3:,0:4].rename(columns={i:str.replace(i,'_','\_') +'$_{k|k-1}$' for i in estimation1.Yttl_new.columns.values},inplace=False)]
, axis=1).plot(ax=ax1,style=linestyles,linewidth=1)
plt.legend(loc='best',fontsize=9,frameon=0);plt.title('Measurements and Filtered Measurements');plt.axes.labelcolor='black'
linestyles = ['-', '--', '-.','-', '--', '-.', '-', '--', '-.','-', '--', '-.']
ax2 = plt.subplot(122)
pd.concat([estimation1.Y.iloc[3:,4:7].rename(columns={i:str.replace(i,'_','\_') for i in estimation1.Y.columns.values},inplace=False),
estimation1.Ytt_new.iloc[3:,4:7].rename(columns={i:str.replace(i,'_','\_') +'$_{k|k}$' for i in estimation1.Ytt_new.columns.values},inplace=False),
estimation1.Yttl_new.iloc[3:,4:7].rename(columns={i:str.replace(i,'_','\_') +'$_{k|k-1}$' for i in estimation1.Yttl_new.columns.values},inplace=False)]
, axis=1).plot(ax=ax2,style=linestyles,linewidth=1)
plt.legend(loc='best',fontsize=9,frameon=0);plt.title('Measurements and Filtered Measurements');plt.axes.labelcolor='black';plt.show();
linestyles = ['-', '--', '-.', ':','-', '--', '-.', ':','-', '--', '-.', ':','-', '--', '-.', ':']
plt.rc('text', usetex=True);fig = plt.figure(figsize=(16,8));ax1 = plt.subplot(121)
pd.concat(
[estimation1.Y.iloc[3:,7:11].rename(columns={i:str.replace(i,'_','\_') for i in estimation1.Y.columns.values},inplace=False),
estimation1.Ytt_new.iloc[3:,7:11].rename(columns={i:str.replace(i,'_','\_') +'$_{k|k}$' for i in estimation1.Ytt_new.columns.values},inplace=False),
estimation1.Yttl_new.iloc[3:,7:11].rename(columns={i:str.replace(i,'_','\_') +'$_{k|k-1}$' for i in estimation1.Yttl_new.columns.values},inplace=False)]
, axis=1).plot(ax=ax1,style=linestyles,linewidth=1)
plt.legend(loc='best',fontsize=9,frameon=0);plt.title('Measurements and Filtered Measurements');plt.axes.labelcolor='black'
linestyles = ['-', '--', '-.','-', '--', '-.','-', '--', '-.','-', '--', '-.'];ax2 = plt.subplot(122)
pd.concat(
[estimation1.Y.iloc[3:,11:].rename(columns={i:str.replace(i,'_','\_') for i in estimation1.Y.columns.values},inplace=False),
estimation1.Ytt_new.iloc[3:,11:].rename(columns={i:str.replace(i,'_','\_') +'$_{k|k}$' for i in estimation1.Ytt_new.columns.values},inplace=False),
estimation1.Yttl_new.iloc[3:,11:].rename(columns={i:str.replace(i,'_','\_') +'$_{k|k-1}$' for i in estimation1.Yttl_new.columns.values},inplace=False)]
, axis=1).plot(ax=ax2,style=linestyles,linewidth=1)
plt.legend(loc='best',fontsize=9,frameon=0);plt.title('Measurements and Filtered Measurements')
plt.axes.labelcolor='black';plt.show()
plt.rc('text', usetex=False);fig = plt.figure(figsize=(14,7));ax1 = plt.subplot(221)
fit_se.iloc[:,0:7].plot(ax=ax1,linewidth=1 )
plt.legend(loc='center left',fontsize=9,frameon=0, bbox_to_anchor=(1, 0.5))
plt.title('Squared-Error for Nominal Bonds at each date',fontsize=12)
plt.axes.labelcolor='black'
ax2 = plt.subplot(222);fit_se.iloc[:,7:].plot(ax=ax2,linewidth=1 )
plt.legend(loc='center left',fontsize=9,frameon=0, bbox_to_anchor=(1, 0.5))
plt.title('Squared-Error for TIPS at each date',fontsize=12);plt.axes.labelcolor='black'
ax3 = plt.subplot(212)
fit_rmse_all.plot(ax=ax3,linewidth=1,kind='bar')
plt.legend(loc='best',fontsize=9,frameon=0)
plt.title('RMSE across all dates',fontsize=12)
plt.axes.labelcolor='black';fig.subplots_adjust(wspace=0.5,hspace=0.3);plt.show()
The first values of the filtered states are usually off; the values from the smoother are not.
linestyles = ['-', '--', '-.', ':','-', '--', '-.', ':','-', '--', '-.', ':','-', '--', '-.', ':'];plt.rc('text', usetex=True)
fig, ax = plt.subplots(1)
figures = {'fig2': fig, 'ax_fig2': ax}
pd.concat(
[estimation1.XtT_new.rename(columns={i:i+'$_{smoother}$' for i in estimation1.Xtt_new.columns.values},inplace=False),
estimation1.Xtt_new.rename(columns={i:i+'$_{k|k}$' for i in estimation1.Xtt_new.columns.values},inplace=False),
estimation1.Xttl_new.rename(columns={i:i+'$_{k|k-1}$' for i in estimation1.Xttl_new.columns.values},inplace=False)]
, axis=1).plot(ax=figures['ax_fig2'],figsize=(8,8),style=linestyles,linewidth=1)
plt.legend(loc='center left',fontsize=9,frameon=0, bbox_to_anchor=(1, 0.5))
figures['ax_fig2'].set_title('Filtered States/Latent Variables');plt.axes.labelcolor='black';plt.show()
We are not yet confident in the implied inflation and deflation probabilities because the fit is not yet satisfactory.
Nonetheless, to illustrate the machinery, we plot the results below.
plt.rc('text', usetex=False);fig, ax = plt.subplots(1)
figures = {'fig2': fig, 'ax_fig2': ax}
estimation1.exp_inf[['horizon_1yr','horizon_2yr','horizon_5yr','horizon_10yr','horizon_30yr']].plot(ax=figures['ax_fig2'],
figsize=(6,6),linewidth=1)
plt.legend(loc='center left',fontsize=12,frameon=0, bbox_to_anchor=(1, 0.5))
figures['ax_fig2'].set_title('Implied Expected Inflation')
plt.axes.labelcolor='black';plt.show()
plt.rc('text', usetex=False);fig, ax = plt.subplots(1);figures = {'fig2': fig, 'ax_fig2': ax}
estimation1.prob_def[['horizon_1yr','horizon_2yr','horizon_5yr','horizon_10yr','horizon_30yr']].plot(ax=figures['ax_fig2'],
figsize=(6,6),linewidth=1)
plt.legend(loc='center left',frameon=0, bbox_to_anchor=(1, 0.5))
figures['ax_fig2'].set_title('Implied Probability of Deflation')
plt.axes.labelcolor='black';plt.show()
In the forecasts below, we are “cheating” a bit because the parameters and filtered states are obtained from a full-sample fit.
Instead, we should estimate on a rolling or expanding window and then forecast out of sample starting from the last date of the estimation window.
Nonetheless, for now let us examine some in-sample forecasts.
Not surprisingly, these forecasts are rather poor given that the fit itself was poor...
fig, ax = plt.subplots(1,figsize=(14,8))
line,=plt.plot(forecast[['TIPS_y08']].iloc[np.arange(0, forecast.shape[0], estimation1.forecast_horizon+1)].set_index(\
forecast[['TIPS_y08']].iloc[np.arange(0, forecast.shape[0], estimation1.forecast_horizon+1)].index.get_level_values('date')),linewidth=1.5\
)
line.set_label('TIPS_y08: actual')
for t in forecast.index.get_level_values('date').unique():#loop and plot forecast for each horizon
line,=plt.plot(
forecast[['TIPS_y08']].iloc[forecast.index.get_level_values('date')==t,:].set_index(
forecast[['TIPS_y08']].iloc[forecast.index.get_level_values('date')==t,:].index.get_level_values('horizon'))
,color='red',linewidth=0.5)
fill=plt.fill_between(forecast[['TIPS_y08']].iloc[forecast.index.get_level_values('date')==t,:].index.get_level_values('horizon').values
,
forecast[['TIPS_y08']].iloc[forecast.index.get_level_values('date')==t,:].values[:,0]-
forecast_std[['TIPS_y08']].iloc[forecast.index.get_level_values('date')==t,:].values[:,0],
forecast[['TIPS_y08']].iloc[forecast.index.get_level_values('date')==t,:].values[:,0]+
forecast_std[['TIPS_y08']].iloc[forecast.index.get_level_values('date')==t,:].values[:,0]
, facecolor='red', interpolate=True, alpha=.05
)
line.set_label('TIPS_y08: forecasts');fill.set_label('TIPS_y08: forecasts stderr')
plt.legend(loc='best',fontsize=9);plt.axes.labelcolor='black';plt.show()
plt.rc('text', usetex=False);fig = plt.figure(figsize=(14,7));ax1 = plt.subplot(221)
forecast_rmse.iloc[:,0:7].plot(ax=ax1,linewidth=1)
plt.legend(loc='center left',fontsize=9,frameon=0, bbox_to_anchor=(1, 0.5))
plt.title('RMSE of forecasts for Nominal Bonds at each date',fontsize=12);plt.axes.labelcolor='black'
ax2 = plt.subplot(222);forecast_rmse.iloc[:,7:].plot(ax=ax2,linewidth=1 )
plt.legend(loc='center left',fontsize=9,frameon=0, bbox_to_anchor=(1, 0.5))
plt.title('RMSE of forecasts for TIPS at each date',fontsize=12);plt.axes.labelcolor='black'
ax3 = plt.subplot(212);forecast_rmse_all.plot(ax=ax3,linewidth=1,kind='bar' )
plt.legend(loc='best',fontsize=9,frameon=0);plt.title('RMSE of forecasts across all dates',fontsize=12);plt.axes.labelcolor='black';
fig.subplots_adjust(wspace=0.5,hspace=0.3);plt.show()
Detailed documentation/appendix: https://github.com/GinoAndTonic/ssylvain_public/blob/master/research/haats/haats_documentation.lyx (LyX source) and https://github.com/GinoAndTonic/ssylvain_public/blob/master/research/haats/haats_documentation.pdf (PDF)
My Python code can be forked from https://github.com/GinoAndTonic/ssylvain_public/blob/master/research/haats